© John Wiley & Sons, Inc.
FIGURE 19-10: The relationship between age and hormone concentration doesn’t conform to a simple function.
In Figure 19-10, you can observe a lot of scatter in these points, which makes it hard to see the more
subtle aspects of the XYZ-age relationship. At what age does the hormone level start to rise? When
does it peak? Does it remain fairly constant throughout child-bearing years? When does it start to
decline? Is the rate of decline after menopause constant, or does it change with advancing age?
It would be easier to answer those questions if you had a curve that represented the data without all the
random fluctuations of the individual points. How would you go about fitting such a curve to these
data? LOWESS to the rescue!
Running a LOWESS regression in R is similar to running other types of regression. You need to tell R which variable
represents x and which one represents y, and it does the rest. If your variables in R are actually named
x and y, the R instruction to run a LOWESS regression is the following: lowess(x, y, …). (We
explain the remaining part of the instruction in the following section.)
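As a minimal sketch of the call described above, the following R code runs lowess() with its default settings. The data are simulated purely for illustration (they are not the data behind Figure 19-10), and the names fit and smoothed are hypothetical.

```r
# Simulated data for illustration only; not the data behind Figure 19-10.
set.seed(42)
x <- runif(200, min = 5, max = 80)                       # hypothetical ages in years
y <- 30 * exp(-((x - 30) / 15)^2) + rnorm(200, sd = 5)   # hypothetical hormone levels

# Run the LOWESS regression with R's default settings.
fit <- lowess(x, y)

# lowess() returns a list with two components:
#   fit$x  the x values, sorted in increasing order
#   fit$y  the smoothed y value at each of those x values
str(fit)

# Arrange the result as a table, one row per data point.
smoothed <- data.frame(x = fit$x, y_smoothed = fit$y)
head(smoothed)
```

Because lowess() sorts its output by x, the rows of smoothed are generally not in the same order as the original data.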
Unlike other forms of regression, LOWESS doesn’t produce a coefficients table. The only output is a
table of smoothed y values, one for each data point, which you can save as a data file. Next, using
other R commands, you can plot the x and y points from your data, and add a smoothed line
superimposed on the scatter graph based on the smoothed y values. Figure 19-11 shows this plot.
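The save-and-plot steps just described could be sketched in base R as follows. This is one possible approach under stated assumptions, not necessarily the commands used to produce Figure 19-11: the data are simulated, and the axis labels, line color, and the names fit and out_file are hypothetical.

```r
# Simulated data for illustration only; not the data behind the figures.
set.seed(42)
x <- runif(200, min = 5, max = 80)                       # hypothetical ages in years
y <- 30 * exp(-((x - 30) / 15)^2) + rnorm(200, sd = 5)   # hypothetical hormone levels

fit <- lowess(x, y)   # smoothed values, using R's default settings

# Save the smoothed values as a data file (a temporary CSV file here).
out_file <- tempfile(fileext = ".csv")
write.csv(data.frame(x = fit$x, y_smoothed = fit$y), out_file, row.names = FALSE)

# Plot the raw points, then superimpose the smoothed curve.
plot(x, y, xlab = "Age (years)", ylab = "Hormone concentration")
lines(fit, col = "red", lwd = 2)   # lines() accepts the list returned by lowess()
```

The smoothed values come back sorted by x, so lines() can connect them from left to right without any extra sorting step.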